Merging Locally Correct Knowledge Bases: 
A Preliminary Report 



Paolo Liberatore 
Dipartimento di Informatica e Sistemistica 
Universita di Roma "La Sapienza" 
Via Salaria 113, 00198, Roma, Italy 
Email: paolo@liberatore.org 

Abstract 

Belief integration methods are often aimed at deriving a single and consis- 
tent knowledge base that retains as much as possible of the knowledge bases to 
integrate. The rationale behind this approach is the minimal change principle: 
the result of the integration process should differ as little as possible from the 
knowledge bases to integrate. We show that this principle can be reformulated 
in terms of a more general model of belief revision, based on the assumption 
that inconsistency is due to the mistakes the knowledge bases contain. Current 
belief revision strategies are based on a specific kind of mistake, which 
however does not include all possible ones. Some alternative possibilities are 
discussed. 

1 Introduction 

Most of the existing belief revision semantics are based — in some way — on a prin- 
ciple that has been formulated at the very beginning of the investigation on this 
topic: the minimal change principle [1, 6]. According to this principle, the result of 
integrating two or more knowledge bases should be as similar as possible to them. 
Semantics proposed for merging agree on this principle, and only differ in the way it 
is applied, i.e., in how to combine the several possibilities arising, in how to measure 
the difference between knowledge bases, in how the knowledge bases are represented, 
in what is the relative reliability of sources, etc. Nevertheless, very few arguments 
against the basic principle exist [16]. 

This paper does not contain arguments against the minimal change principle 
itself, but only against its being a first principle. Taking a different perspective, we show that 
it is indeed a particular consequence of a more general assumption. Namely, we 
present a model of how the knowledge bases to integrate are obtained that justifies 
the minimal change principle, as it is currently applied, only in particular cases. This 
model explains inconsistencies between knowledge bases by assuming that mistakes 
have been made in the process of knowledge acquisition. 

This model is not completely new, as existing merging semantics actually rely on 
its particularization to the case in which mistakes are changes of value of literals. 
For example, Dalal's revision semantics [3] can be reformulated as the result of 
assuming that one knowledge base is free of mistakes, and the other one results from 
introducing mistakes in the value of literals in an otherwise correct knowledge base. 
In formulae, while revising K with P, we assume that the process of acquiring P 
is error-free, while K contains some mistakes, each changing the value of a single 
literal in a model. Therefore, Dalal's revision can be reformulated as the correction 
of a minimal number of mistakes. Other belief revision semantics are based on the 
same principle, but have different rules for combining the different possibilities that 
arise [17, 5, 2]. Iterated belief revision semantics [18, 12], updates [11, 8, 7], and 
merging/arbitration operators [13, 9, 14, 15], are based on similar principles. 

The model proposed in this paper, however, does not only formalize existing 
semantics; being more general, it is applicable to other scenarios, leading to different 
revision techniques. While cases like the example of the stock market experts [9] 
are perfectly modeled in the "mistake of value" model, other ones are not. Some 
examples, like the following one, come from everyday life. 

Example 1 Yesterday, I met an old friend I had not seen in years. While 
talking about the high school days, we shared information about other friends we knew 
at that time. In particular, he told me that George earned a lot of money by creating 
a startup company he then sold, and now he lives in the Nukunonu island. On the 
other hand, I knew that George became incredibly rich with some illegal business, and 
he is currently in jail (but I do not know whether he still has some of the money). 

The union of our knowledge bases is inconsistent, as there are no jails in the 
Nukunonu island. On the other hand, both of us are completely certain of our current 
knowledge. We then had to conclude that we were talking about two different Georges. 
The correct conclusion of merging information should then be that "George^A is rich", 
"George^A lives in the Nukunonu island", and that "George^B is in jail". 

Merging based on the minimal change principle, combined with the "mistake of 
value" assumption as it is usually done, would have led to a completely different 
result. Namely, since we assumed that we are talking about the same George, and 
since both of us have the same confidence on our knowledge, we could only conclude 
that either "George is in jail" or that "George lives in the Nukunonu island", but 
not both (since no jail is in the Nukunonu island.) This is already a problem, as this 
information is not complete about George's current location, while in fact we both 
know exactly where the Georges are. Worse still, since I do not know whether George 
is still rich while my friend is sure he is, I will incorrectly conclude that the George 
I am talking about is still rich, a fact that is not backed up by any evidence. 
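The effect of the homonymy in Example 1 can be illustrated with a minimal sketch. It is not part of the formal development: the variable names, the encoding of formulae as Python predicates over truth assignments, and the helper `consistent` are all assumptions made for illustration only. With a single name for George the union of the two knowledge bases is inconsistent; after renaming, it becomes consistent.

```python
from itertools import product

def consistent(formulas, variables):
    """True if some truth assignment over `variables` satisfies every formula."""
    return any(
        all(f(dict(zip(variables, values))) for f in formulas)
        for values in product([False, True], repeat=len(variables))
    )

# One shared name "george" (hypothetical propositional encoding).
shared_vars = ["george_in_jail", "george_on_island"]
no_jail_on_island = lambda v: not (v["george_in_jail"] and v["george_on_island"])
my_kb = lambda v: v["george_in_jail"]
friend_kb = lambda v: v["george_on_island"]
same_name_ok = consistent([no_jail_on_island, my_kb, friend_kb], shared_vars)

# Two distinct names: georgeA (my friend's George) and georgeB (mine).
split_vars = ["georgeA_in_jail", "georgeA_on_island",
              "georgeB_in_jail", "georgeB_on_island"]
no_jail_on_island_2 = lambda v: (
    not (v["georgeA_in_jail"] and v["georgeA_on_island"])
    and not (v["georgeB_in_jail"] and v["georgeB_on_island"])
)
my_kb_2 = lambda v: v["georgeB_in_jail"]
friend_kb_2 = lambda v: v["georgeA_on_island"]
renamed_ok = consistent([no_jail_on_island_2, my_kb_2, friend_kb_2], split_vars)

print(same_name_ok, renamed_ok)
```

Note that after renaming nothing is retracted from either knowledge base, which is the sense in which both were locally correct.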

This scenario is about a common life incident, but similar problems are common 
in computer science: putting together two LaTeX source files creates the problem of 
the same name for two different macros; similar problems arise in compiling C code 
fragments, etc. In the rest of the paper, we make the simplifying assumption that 
each knowledge base is the knowledge of a different agent involved in the process of 
merging. 

One of the characteristics of the example above is the "local" correctness of the 
involved knowledge bases: both my friend and I had correct information about the 
George we were thinking about. The fact that each agent regards its knowledge 
base as correct, and then has to correct it during the merging process, is true in 
current belief semantics as well. However, the "mistake of value" model implies that 
the conclusions drawn by each single knowledge base were in fact incorrect. On the 
contrary, if the only mistakes are like the same name for two different objects, then 
the conclusions drawn from each knowledge base separately (before the merging) 
are correct, e.g., the conclusion that George cannot travel any more was correctly 
entailed by my knowledge base, and this is a correct conclusion, as I am referring to 
the George who is in jail. The correction to the knowledge bases is therefore only 
necessary to avoid inconsistency while merging the knowledge bases. 

While inconsistency is undoubtedly the most serious problem that may arise during 
merging, it is not the only one. There are mistakes that cannot be discovered 
just by checking the union of the knowledge bases for inconsistency. Indeed, a mistake 
does not necessarily create an inconsistency. On the contrary, some mistakes 
make the union of the knowledge bases weaker than it should be. An example of 
this case is when two knowledge bases give different names to the same object, which 
forbids drawing conclusions based on two facts contained in the two knowledge bases. 

Example 2 Still talking with my high school friend, I mentioned Teddy, who entered 
the Law school; I thought that if he ever had graduated, he would have ended up in 
jail. The friend I was talking with, however, does not remember this Teddy, and the 
only guy he knows entered Law was Bobby, who actually graduated. In fact, Teddy 
was a nickname for Bobby, but we did not remember this fact. 

No inconsistency arises in this case. However, the conclusion that Bobby is 
(likely) in jail could not be drawn by simply putting together the knowledge we had. 
Contrary to common merging scenarios, the conjunction of the knowledge bases is 
weaker than it should be. Such problems are clearly difficult to diagnose, as they do 
not create an inconsistency. The only way to find them out is from the fact that the 
resulting knowledge base is weaker than it should be. For example, knowing that only 
one person from our class entered the Law school would have allowed us to find out 
that Bobby and Teddy must be the same person. 



In order to produce a knowledge base in which as many mistakes as possible are 
corrected, we use two formulae that act as integrity constraints. Formally, we are 
given a multiset of knowledge bases $\mathcal{K}$ and two formulae $A$ and $B$; the result of 
the integration process is a formula $K = \Delta^A_B(\mathcal{K})$ such that $K \models A$ and $K \wedge B \not\models \bot$. 
This way, we constrain the resulting knowledge base to have (at least) a specific 
set of consequences $A$, and not to have some undesired other consequences $\neg B$. The 
formula $A$ formalizes the usual integrity constraints (facts that should remain true 
after integration), while $B$ extends the usual consistency requirement: $B = \top$ only 
enforces the result of integration to be consistent. Since $K = \Delta^A_B(\mathcal{K})$ is a formula whose 
set of models is contained in $Mod(A)$, and is not contained in $Mod(\neg B)$, we call $A$ 
and $B$ the upper and lower bound of the merging operator, respectively. 
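The two bounds can be checked by brute-force enumeration of models. The following is only a sketch under stated assumptions: formulae are encoded as Python predicates over truth assignments, and the names `models` and `satisfies_bounds` are hypothetical helpers, not part of the framework.

```python
from itertools import product

def models(formula, variables):
    """Set of assignments (tuples of booleans, ordered as `variables`) satisfying formula."""
    return {
        values
        for values in product([False, True], repeat=len(variables))
        if formula(dict(zip(variables, values)))
    }

def satisfies_bounds(k, a, b, variables):
    """Upper bound: every model of K is a model of A. Lower bound: K and B share a model."""
    mod_k = models(k, variables)
    entails_a = mod_k <= models(a, variables)
    consistent_with_b = bool(mod_k & models(b, variables))
    return entails_a and consistent_with_b

variables = ["jail", "island"]
a = lambda v: not (v["jail"] and v["island"])   # integrity constraint A
b = lambda v: True                               # B = true: plain consistency
good_k = lambda v: v["jail"] and not v["island"]
bad_k = lambda v: v["jail"] and v["island"]      # violates A

print(satisfies_bounds(good_k, a, b, variables))
print(satisfies_bounds(bad_k, a, b, variables))
```

An inconsistent result has no models, so it trivially passes the upper bound but always fails the lower one.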

If the union of the knowledge bases of $\mathcal{K}$ implies $A$ and is consistent with $B$, 
we assume that there is no problem, i.e., the knowledge bases do not contain any 
mistake. This assumption may be wrong anyway, but we have no way to realize 
it. The interesting case is when either constraint is not satisfied. In this case, we 
assume that some mistakes have been made while acquiring the knowledge bases. 
Some possible mistakes are listed below. The last three mistakes of the list are the 
only ones leading to a locally incorrect knowledge base. 

homonymy: two agents use the same variable while they should use two different 
ones; 

synonymy: two agents use different variables while they should use the same one; 

subject misunderstanding: a formula is stated using one variable, while it should 
use a different one; 

extension: a formula F is extended to another variable or set of variables: formally, 
the agent assumes F[X/Y] in addition to F; 

generalization: a formula is assumed to hold in general, while it holds only under 
some assumptions; 

particularization: a formula is assumed to hold only in a specific scenario, while 
it is more general; 

ambiguity: a formula containing $a \vee b$ is assumed to be $a$ alone (or $b$ alone, or both 
$a$ and $b$); 

exclusion: a formula containing an inclusive or is taken to refer to the exclusive or, 
or vice versa; 

value: the formula is incorrect because it contains a model with a wrong value. 

Besides the mistake of value, these mistakes can be grouped in three categories: 
mistakes due to a wrong interpretation of variables (homonymy, synonymy, and 
subject misunderstanding); mistakes due to a wrong interpretation of context 
(generalization, particularization, and extension); mistakes due to a wrong interpretation of the logic 
(ambiguity and exclusion). 

Some other mistakes are particular cases of the above ones. For example, an 
agent may incorrectly assume that a previously true fact continues to hold while it 
does not: this is a subcase of incorrect generalization. Another similar mistake is 
the incorrect simplification of a definition, like "the water boils at 100°C" instead of 
"the water boils at 100°C at sea level". 

In the domain we consider, each agent introduces some mistakes into a truly 
correct knowledge base. This is modeled by assuming that each agent modified its 
original knowledge base in some way. Clearly, this is only a theoretical model: if the 
agent ever had a correct knowledge base, it would not have modified it. However, this way 
we can say that "the agent modified the knowledge base", which simplifies the more 
correct sentence "the agent incorrectly considered the information x to be y". 

Merging is the process of first correcting mistakes in the knowledge bases, and 
then conjoining them. Correcting mistakes, in turn, is a two-phase process: first, 
we have to find out which mistakes have been made, and then correct them. 
We initially assume that an ordering of likeliness of mistakes is known, and then 
consider the problem of how to derive it from the knowledge bases. In this second 
case, however, we cannot expect the merging process to do much, given the high 
number of possible mistakes: for example, the multiset $\{a, \neg a\}$ may be inconsistent 
because the second $a$ should be $b$, or because $a$ is only true when $b$ is true (that is, 
the first formula should be $b \rightarrow a$ instead of $a$ alone), or because the ambiguity $a \vee b$ 
of the first formula has been interpreted as a choice, and the agent has incorrectly 
assumed $a$, etc. The number of possibilities increases with the number of variables 
and with the size and complexity of the knowledge bases. The process of correcting 
the mistakes can also be problematic: knowing that a formula has been obtained by 
changing a name is not enough if we do not know the original name. 

We make some simplifications. The first one is to neglect the mistakes of logic 
(ambiguity and exclusion). The second one is to restrict our study to propositional 
knowledge bases. While first-order logic (even without function symbols) is uncommon 
in belief revision studies, this assumption makes us disregard the very relevant 
case of epistemic bases [10], which contain not only the agent's belief, but also what 
it considers more or less plausible. 

2 A Model of the Sources 



In this section, we give a formal definition of our framework. The general belief 
merging process can be visualized as in Figure 1: there are a number of agents 
(sources) each sending a knowledge base to a centralized "knowledge merger". We 
do not consider the more sophisticated models that are sometimes used (e.g., an 
agent supplies more than one knowledge base). 

Figure 1: The basic model of belief merge. 



We improve over this simple schema by providing a model of how the sources get 
the knowledge bases $K_i$ they pass to the merger: each $K_i$ is obtained by applying 
one or more transformations to a knowledge base $S_i$, which is assumed to be correct. 
Figure 2 is a graphical representation of this model. 

Figure 2: The model of belief merge, with mistakes. 

The "mistake of value" revision semantics fit in this model: each $K_i$ is obtained 
from $S_i$ by applying the transformation that changes the value of a variable in a 
model. Namely, let $\tau^M_x$ be the transformation that takes a formula, and gives 
another formula in which the model $M$ is replaced by the model with the opposite value 
of $x$. Each $K_i$ is obtained from $S_i$ by applying a suitable number of such 
transformations. Specific revision/arbitration/merging operators can then be formalized by 
assuming a form of minimality of the mistakes, and then combining in some way the 
possible results of this assumption. 

For example, Dalal's revision assumes that a. one of the knowledge bases is correct 
(no transformation has been applied to it); b. the other knowledge base results from 
the application of a number of transformations $\tau^M_x$ to a correct one; and c. a minimal 
number of transformations have been applied. If more than one knowledge base results 
from inverting these transformations, they are disjoined. This semantics fits into the 
proposed model: the $K_i$'s are obtained by applying transformations to the $S_i$'s, and 
the process of integration attempts to invert them. 
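The reading of Dalal's revision as the correction of a minimal number of value mistakes can be sketched as follows. This is a brute-force illustration, not an efficient or authoritative implementation; the encoding of formulae as predicates and the names `models`, `hamming` and `dalal_revise` are assumptions. The function returns the models of the error-free base $P$ that need the fewest literal flips to reach some model of $K$; the returned set stands for the disjunction of the alternatives.

```python
from itertools import product

def models(formula, variables):
    """List of assignments (tuples of booleans) satisfying formula."""
    return [
        values
        for values in product([False, True], repeat=len(variables))
        if formula(dict(zip(variables, values)))
    ]

def hamming(m1, m2):
    """Number of variables on which two models differ (number of value mistakes)."""
    return sum(x != y for x, y in zip(m1, m2))

def dalal_revise(k, p, variables):
    """Models of P at minimal Hamming distance from the models of K."""
    mod_k, mod_p = models(k, variables), models(p, variables)
    if not mod_k or not mod_p:
        return set(mod_p)   # nothing to correct against: keep P as it is
    distance = {m: min(hamming(m, mk) for mk in mod_k) for m in mod_p}
    best = min(distance.values())
    return {m for m, d in distance.items() if d == best}

variables = ["a", "b"]
k = lambda v: v["a"] and v["b"]                # K = a and b, assumed to contain mistakes
p = lambda v: not v["a"] or not v["b"]         # P = not a or not b, assumed error-free
revised = dalal_revise(k, p, variables)
print(sorted(revised))
```

In this run one flipped literal suffices, so the model in which both variables are false (two flips) is discarded.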

Formalizing Dalal's revision in this way shows how integration can be done in 
general: inverting the transformation applied to $S_i$, and merging what results. Ideally, 
we should be able to obtain the knowledge bases $S_i$, which are assumed correct. 
Unfortunately, inverting the transformations cannot be done uniquely, as the merger 
only knows the $K_i$'s, but has no direct knowledge of the transformations used or the 
original $S_i$'s. For example, $K_i = a$ may be correct, or may be the result of changing 
a variable name to $S_i = b$, or may be a wrong generalization of $S_i = c \rightarrow a$, and so 
on. 

The mistakes listed in Section 1 can be formalized by the following transformations. 

variable substitution: $\tau^s_{x,y}(F) = F[x/y]$; 

generalization: $\tau^g_x(F) = F[x/\mathrm{true}]$; 

particularization: $\tau^p_x(F) = x \rightarrow F$; 

Variable substitution models all mistakes relative to variable 
names: homonymies, renaming, and subject misunderstanding. Wrong generalization 
is the mistake of neglecting some assumptions of an (otherwise true) fact. This 
can be formalized by taking the original (correct) formula $F$, and replacing the 
assumption $x$ with true. Note that the resulting formula $\tau^g_x(F)$ does not contain $x$ 
at all, but has exactly the models $F$ would have if $x$ is true. The simplest case 
of generalization is when $x \rightarrow F$ is taken to be $F$: if $F$ does not contain $x$, then 
$F = \tau^g_x(x \rightarrow F)$. However, $\tau^g_x$ also models more complex cases of generalization. 
Particularization is easy to formalize: some assumptions are believed to be required 
for some fact to hold, while they are not. Generalization and particularization can, 
to some extent, be considered the opposite of each other, since $\tau^g_x(\tau^p_x(F)) = F$. 
However, the converse does not hold, as it may be that $\tau^p_x(\tau^g_x(F)) \neq F$; this is the case, 
for example, if $F$ does not mention $x$ at all. We neglect mistakes of logic, that is, 
ambiguity and exclusion, as they are too hard to detect and invert. Ambiguity in particular 
is difficult to detect without a lot of additional information: given a formula, 
it may be that each of its subformulae was originally disjoined with another formula 
(that may be an arbitrary formula of the domain). Even restricting to literals, the 
number of possibilities makes the problem quite difficult. 
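The three transformations can be sketched on a small formula representation. The tuple-based syntax tree and the function names `substitute`, `generalize`, `particularize` and `equivalent` are assumptions made for this illustration only. The example checks the two compositions discussed above for a formula that does not mention x.

```python
from itertools import product

def var(name):
    return ("var", name)

def evaluate(f, v):
    """Truth value of formula f under assignment v (dict from names to booleans)."""
    tag = f[0]
    if tag == "var":
        return v[f[1]]
    if tag == "const":
        return f[1]
    if tag == "not":
        return not evaluate(f[1], v)
    if tag == "and":
        return evaluate(f[1], v) and evaluate(f[2], v)
    if tag == "or":
        return evaluate(f[1], v) or evaluate(f[2], v)
    if tag == "imp":
        return (not evaluate(f[1], v)) or evaluate(f[2], v)
    raise ValueError(tag)

def replace_var(f, x, new):
    """Replace every occurrence of variable x in f by the formula `new`."""
    tag = f[0]
    if tag == "var":
        return new if f[1] == x else f
    if tag == "const":
        return f
    return (tag,) + tuple(replace_var(g, x, new) for g in f[1:])

def substitute(f, x, y):      # variable substitution: F[x/y]
    return replace_var(f, x, var(y))

def generalize(f, x):         # generalization: F[x/true]
    return replace_var(f, x, ("const", True))

def particularize(f, x):      # particularization: x -> F
    return ("imp", var(x), f)

def equivalent(f, g, variables):
    """True if f and g have the same truth value under every assignment."""
    for values in product([False, True], repeat=len(variables)):
        v = dict(zip(variables, values))
        if evaluate(f, v) != evaluate(g, v):
            return False
    return True

F = ("or", var("a"), var("b"))          # F does not mention x
V = ["a", "b", "x"]
gen_after_part = equivalent(generalize(particularize(F, "x"), "x"), F, V)
part_after_gen = equivalent(particularize(generalize(F, "x"), "x"), F, V)
print(gen_after_part, part_after_gen)
```

Generalizing after particularizing gives back a formula equivalent to F, while particularizing after generalizing yields x -> F, which is strictly weaker than F.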

3 The Merging Process 

The merging process consists in inverting the transformations, and then putting 
together the resulting knowledge bases. Since we only have the knowledge bases $K_i$ 
after the changes, we do not know for sure which transformations are the ones to 
invert. Extending the principles used for revision and arbitration, we make some 
hypotheses about the kind of mistakes that have been made. Considering only the 
most likely possibilities, we are still left with a number of possible scenarios. For each 
of them, however, we know how to invert the transformations and obtain the original 
knowledge bases $S_i$, which can then be conjoined to get the maximum possible 
information. What results is the merged knowledge base in one of the possible scenarios 
we assumed. Therefore, we have one knowledge base for each scenario: since these 
are alternative possibilities, the right way of combining them is by disjunction. 

Formally, we begin with the knowledge bases $K_1, K_2, \ldots, K_n$, and make an 
assumption about the transformations that have been used to obtain them. Inverting 
these transformations, we obtain $K'_1, K'_2, \ldots, K'_n$. If the assumption about the 
transformations is correct, the best way of merging them is simply by putting them 
together, thus obtaining $K = K'_1 \wedge K'_2 \wedge \cdots \wedge K'_n$. 

On the other hand, this is only a possible scenario. In another scenario, we may 
get a different result of merging $K^2$, in another one we may have yet another result 
$K^3$, etc. Since these are the results of considering different alternatives we consider 
equally likely, the final result of merging should be the disjunction (logical or) of 
them. 
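At the level of models, conjoining within a scenario is set intersection and disjoining across scenarios is set union. The sketch below assumes each corrected knowledge base is given directly as its set of models (tuples of truth values over the variables a and b); the function name `merge_scenarios` and this data layout are hypothetical.

```python
from functools import reduce

def merge_scenarios(scenarios):
    """Union, over scenarios, of the intersection of the corrected bases' model sets.

    `scenarios` is a list; each scenario is a non-empty list of model sets,
    one per corrected knowledge base.
    """
    merged = set()
    for corrected_bases in scenarios:
        conjunction = reduce(lambda x, y: x & y, corrected_bases)
        merged |= conjunction
    return merged

# Models over (a, b).
A_TRUE = {(True, True), (True, False)}       # models of a
A_FALSE = {(False, True), (False, False)}    # models of not a
B_TRUE = {(True, True), (False, True)}       # models of b

scenario_1 = [A_TRUE, B_TRUE]     # first base corrected to a, second to b
scenario_2 = [A_FALSE, B_TRUE]    # first base corrected to not a, second to b
result = merge_scenarios([scenario_1, scenario_2])
print(sorted(result))
```

Here the two scenarios disagree on a, so the merged result retains only b: considering more scenarios weakens the outcome.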

Figure 3 shows this process. Finding and inverting the transformations are central 
steps of this process: on the one hand, we should select as few possible scenarios as 
possible to avoid a too weak result; on the other hand, including too few possibilities 
may lead us to neglect the one that really represents the state of the world. 

For simplicity, we replace these first two steps of the process by the one of finding 
one (or more) n-tuples of inverse transformations, one for each knowledge base. 
Indeed, finding the transformations that have been applied and inverting them can 
be formalized by the single step of finding the transformations that lead from the 
knowledge bases we have to the original ones; we call them "inverse transformations" 
simply because they invert the transformations that have been previously applied, 
but they are still the transformations previously considered, like variable substitution, 
etc. 

Figure 3: The merging process 

In order to select one (or more) n-tuple of inverse transformations, we define 
an ordering over all possible n-tuples of sets of transformations. This way, we can 
compare a possible scenario with another one, and tell which one is the most likely. A 
different and simpler model is that in which there is one ordering for each knowledge 
base. We do not adopt this model because mistakes in one knowledge base should be 
ranked not only according to that source, but also as a result of comparing it with 
the other knowledge bases. This is why we consider an ordering ranking n-tuples 
rather than comparing transformations locally, i.e., source by source. 

This ordering may originate in different ways: it can be part of the knowledge 
of each source (that is, each agent has its own idea of the mistakes it likely makes), 
or it can be information the merger has (possibly based on the meaning of the 
literals and other related knowledge), or it can be derived from the knowledge bases $K_i$ 
using some heuristics. In the first two cases, we can simply assume that the ordering 
is given; the problem of obtaining it from the knowledge bases is discussed in the 
next section. Either way, in the rest of this section we assume that this ordering is 
given. In particular, we assume that $\mathcal{R}$ is a function that associates an integer to 
each n-tuple of sets of transformations, giving the likeliness they correct the mistakes 
in the knowledge bases $K_1, \ldots, K_n$. As is common in belief revision, we interpret a 
lower rank as a higher degree of likeliness, and therefore prefer n-tuples with the 
lowest rank. 

The set of possible transformations that may have been used for generating $K_i$ 
from $S_i$ is defined as follows: 

$T(K_i) = \{\tau^s_{x,y} \mid y \in Var(K_i),\ x \notin Var(K_i)\} \cup \{\tau^g_x \mid x \notin Var(K_i)\} \cup \{\tau^p_x \mid \neg x \models K_i\}$ 

For example, $K_i$ may result from replacing $x$ with $y$, and this is why the renaming 
of $x$ with $y$ is in the first part of $T(K_i)$ only if $y$ is mentioned in $K_i$ while $x$ is not. 
The other parts of $T(K_i)$ are motivated in a similar way. The set $T(K_i)$ is potentially 
infinite, as there are potentially infinitely many variables $x \notin Var(K_i)$. For example, 
if $S_i = x \vee y$, and the source renamed x with z, it ends up with $K_i = y$. Inverting 
this transformation amounts to deciding which name z originally had, and this is 
impossible by looking at $K_i$ only. When a variable disappears from a knowledge 
base, like in this case, we either use a variable that appears in another knowledge 
base, or introduce a new one. This limits the set of possible transformations: when 
we write $x \notin Var(K_i)$ we assume that either $x$ is a variable occurring in some other 
knowledge base, or $x$ is a new variable created on purpose. 

In order to invert the transformations, we define an inverse relation $Inverse_i(\tau_1, \tau_2)$, 
which relates two transformations $\tau_1$ and $\tau_2$ in such a way that $\tau_2$ undoes the changes made 
by $\tau_1$ on the knowledge base $K_i$. Note that $Inverse_i$ is indexed by $i$, thus making 
this relation dependent on the considered knowledge base. However, only the names 
of the variables in $K_i$ are really needed. Also note that $Inverse_i$ is not a function, as 
renaming and generalization cannot be uniquely inverted. This relation is formally 
defined as follows. 

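$Inverse_i = \{(\tau^s_{y,x}, \tau^s_{x,z}) \mid \tau^s_{y,x} \in T(K_i),\ x \in Var(K_i),\ z \notin Var(K_i)\} \cup$ 
$\{(\tau^g_x, \tau^p_y) \mid \tau^g_x \in T(K_i),\ y \notin Var(K_i)\} \cup$ 
$\{(\tau^p_x, \tau^g_y) \mid \tau^p_x \in T(K_i),\ y \notin Var(K_i)\}$ 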

The relation $Inverse_i$ defines the set of all possible inverse transformations on 
the knowledge base $K_i$. Since there are too many such transformations, we also 
consider the ordering that tells their degree of likeliness. This ordering is formalized 
as a function from n-tuples of sets of transformations to integers. Formally, an 
integer is associated to each subset of $Inverse_1 \times \cdots \times Inverse_n$. The idea is that 
each subset of this set contains a set of transformations for each knowledge base; 
implicitly, it tells the mistakes that have been made. The ordering simply tells the 
degree of likeliness of these mistakes. We denote this function as $\mathcal{R}$. 

This ranking makes the process of merging possible. As is common in belief 
revision, we consider all possible changes to the knowledge bases, select only the ones 
that lead to the expected result ($A$ should be derivable but $\neg B$ should not), and then 
use the ranking to further reduce the set of possibilities. 

In order to define the first step (selection of transformations), we have to specify, 
for each n-tuple of sets of transformations, what is the resulting knowledge base. Let 
therefore $(\mathcal{L}_1, \ldots, \mathcal{L}_n)$ be this n-tuple, where $\mathcal{L}_i \subseteq \{\tau \mid \exists \tau' . (\tau', \tau) \in Inverse_i\}$. The 
result of applying the transformations in $\mathcal{L}_i$ to $K_i$ is as follows: 

$\tau_{\mathcal{L}_i}(K_i) = \tau_1(\ldots(\tau_m(K_i)))$ where $\mathcal{L}_i = \{\tau_1, \ldots, \tau_m\}$ 

We extend this operator to tuples of set of transformations and to tuples of 
knowledge bases as follows. 

This is the result of merging only if $(\mathcal{L}_1, \ldots, \mathcal{L}_n)$ is known to be the way in which 
the transformations have to be inverted, or it is the only way in which both the 
constraint on $A$ and the constraint on $B$ can be satisfied. Usually, this is not the 
case, so we have to use the ranking $\mathcal{R}$ to make a selection. 

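The transformations we consider are the minimal ones among those making the 
result of merging imply $A$ but not imply $\neg B$. Minimality is defined using the 
ranking. 

$M_{\mathcal{R}} = \min(\{(\mathcal{L}_1, \ldots, \mathcal{L}_n) \mid \bigwedge_{i=1,\ldots,n} \tau_{\mathcal{L}_i}(K_i) \models A \text{ and } \bigwedge_{i=1,\ldots,n} \tau_{\mathcal{L}_i}(K_i) \wedge B \not\models \bot\}, \mathcal{R})$ 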

This formula defines a set of transformations for each source. Clearly, there is no 
guarantee that such a minimum is unique. The merger applies each set of possible 
transformations, and disjoins the results: 

$\Delta^A_B(\mathcal{K}) = \bigvee_{(\mathcal{L}_1, \ldots, \mathcal{L}_n) \in M_{\mathcal{R}}} \bigwedge_{i=1,\ldots,n} \tau_{\mathcal{L}_i}(K_i)$ 

By construction, $\Delta^A_B(\mathcal{K})$ implies $A$ simply because it is a disjunction of terms, 
each implying $A$. For the same reason, since each term is consistent with $B$, so is 
the result of merging. 
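The selection step can be sketched at the level of models: discard the candidate scenarios that violate either bound, keep those of minimal rank, and disjoin what remains. The data layout (each candidate as a rank paired with the model sets of the corrected bases) and the name `merge_with_bounds` are assumptions of this sketch; the ranks are supplied by hand rather than computed.

```python
from functools import reduce

def merge_with_bounds(candidates, mod_a, mod_b):
    """Models of the merged base.

    `candidates`: list of (rank, bases), where bases is a non-empty list of
    model sets, one per corrected knowledge base.
    `mod_a`, `mod_b`: model sets of the upper bound A and the lower bound B.
    """
    admissible = []
    for rank, bases in candidates:
        conjunction = reduce(lambda x, y: x & y, bases)
        # Upper bound: the conjunction implies A; lower bound: it is consistent with B.
        if conjunction <= mod_a and conjunction & mod_b:
            admissible.append((rank, conjunction))
    if not admissible:
        return set()
    best = min(rank for rank, _ in admissible)
    return set().union(*(conj for rank, conj in admissible if rank == best))

ALL = {(True, True), (True, False), (False, True), (False, False)}
mod_a = ALL - {(True, True)}       # A forbids both variables being true
mod_b = ALL                        # B = true

candidates = [
    (0, [{(True, True), (True, False)}, {(True, True), (False, True)}]),  # no correction
    (1, [{(True, False)}, ALL]),
    (1, [ALL, {(False, True)}]),
    (2, [{(False, False)}, {(False, False)}]),
]
merged = merge_with_bounds(candidates, mod_a, mod_b)
print(sorted(merged))
```

The uncorrected scenario has the lowest rank but violates the upper bound, so the two rank-1 scenarios are selected and disjoined.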

4 Selection Heuristics 

The merging process outlined in the last section depends on which the most likely 
transformations are. So far, we simply assumed the knowledge of the ordering $\mathcal{R}$, 
either because it is an additional information the agents have, or because it is known 
to the centralized merger. However, the case in which no additional information, 
besides the knowledge bases, is known is also important. In this section we consider 
the case in which no information about the likeliness of mistakes is given, and the 
ordering must be drawn from the knowledge bases Ki. 

While it is always theoretically possible to select all possible transformations that 
satisfy the constraints $A$ and $B$, these transformations may be too many to give useful 
information. Indeed, the more scenarios are considered, the weaker the 
resulting knowledge base is, and the number of possible scenarios may be very large 
even for very simple knowledge bases. For example, the knowledge base $K_i = x \rightarrow y$ 
may result from the renaming of $z$ to $x$, or from the particularization of $y$ (i.e. we 
incorrectly assumed that $y$ holds only when $x$ is true), or from the generalization of 
$x \wedge z \rightarrow y$, etc. A first selection criterion is that we only accept sets of transformations 
that produce a knowledge base that satisfies the upper and the lower bounds. This, 
however, may be still too weak a constraint to limit the number of transformations. 

For this reason, we also assume a minimality criterion; namely, we assume 
that as few mistakes as possible have been made while producing $K_i$ from $S_i$. In a 
sense, this is the minimal change principle in disguise: assuming a minimal number 
of mistakes, we still consider a minimal number of (inverse) transformations to be 
applied to the knowledge bases. On the other hand, the minimal change principle in 
this form is not a first principle any longer, but only a consequence of a more general 
assumption. 

The principle of minimizing the number of mistakes/transformations, however, 
may still be not enough, that is, the number of possible scenarios may still be too 
high. Therefore, we use the knowledge bases to further limit the number of possible 
alternatives. In this section, we present a selection heuristic that is based only on 
the knowledge bases. We assume that no further information is given about the 
meaning of literals, the likeliness of mistakes, etc. and that we cannot perform any 
information-gathering actions (a common assumption in belief revision, less in the 
real world.) 

Another problem of the merging process is that some transformations cannot be 
inverted uniquely. In particular, knowing that $K_i = \tau^g_x(S_i)$ does not allow us to derive 
$S_i$. In such cases, we simply assume that $S_i = \tau^p_x(K_i)$. This is equivalent to assuming 
that $\tau^g_x$ is only applied to formulae like $x \rightarrow F$, i.e., having $x$ as a precondition. 

In order to define this ranking $\mathcal{R}$, we observe that it only needs to rank the 
transformations according to their plausibility, regardless of whether they lead to 
satisfy the lower and upper bounds of merging: it is the merging process that enforces 
these constraints to be satisfied. 

The ranking $\mathcal{R}$ is based on assuming (besides a minimal number of mistakes) 
that the initial knowledge bases $S_i$ are similar to each other. Therefore, 
the best inverse transformations are those making the resulting knowledge bases $K'_i$ 
as similar to each other as possible. 

Examples justifying this way of operating are easy to find: if a knowledge base is 
identical to another one except for a different variable name, the change of the name 
is intuitively the most reasonable action to do before integrating the two knowledge 
bases. 

This example can be generalized to the case in which applying a transformation 
to $K_i$ makes it equal to $K_j$: this transformation is likely to be the inverse of the 
one that changed $S_i$ to $K_i$. In this case, $S_i = S_j$, but this is not always the case. To 
make this criterion generally applicable, we need a way of applying it even 
when the two knowledge bases cannot be made identical. To this aim, we measure 
the similarity between knowledge bases, and trade off between the number of inverse 
transformations and the degree of similarity of the resulting knowledge bases. We 
therefore need a way for measuring the similarity between two knowledge bases, and 
then a way for combining this measure with the number of changes needed to make 
the knowledge bases similar. 

The measure of similarity can be defined either syntactically or semantically; we
define a semantic measure. There are two reasons for this choice: first, it is possible
to express the same knowledge in different ways (so that S_i and S_j, while identical
in their sets of models, are syntactically different); second, each source may have
further changed the syntactic form of its knowledge base to suit its purposes.

Let K_1 and K_2 be two knowledge bases, and let Mod(K_1) and Mod(K_2) be their
sets of models. The measure of similarity should grow with the size of the
intersection Mod(K_1) ∩ Mod(K_2), and with that of the intersection of their
complements Mod(¬K_1) ∩ Mod(¬K_2). The total size of these two sets is in fact
equal to |Mod(K_1 ≡ K_2)|. The degree of similarity should also decrease with the
number of models that satisfy only one formula, that is, the size of
Mod(K_1 ≢ K_2). A possible choice is the linear combination of these two measures:

\[ \delta(K_1, K_2) = |Mod(K_1 \equiv K_2)| - |Mod(K_1 \not\equiv K_2)| = 2 \cdot |Mod(K_1 \equiv K_2)| - |Mod(\mathit{true})| \]

This function is in practice the same as |Mod(K_1 ≡ K_2)|. Another possibility is
that of using a quotient: δ(K_1, K_2) = |Mod(K_1 ≡ K_2)| / |Mod(K_1 ≢ K_2)|.
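
For small alphabets, both candidate measures can be computed by enumerating all
assignments. The following Python sketch is only an illustration: it assumes that a
knowledge base is represented as a function from an assignment (a dictionary) to a
truth value, and the names agreement, similarity_linear, and similarity_quotient
are hypothetical. Returning infinity for equivalent formulae in the quotient is
also an assumption, since the text does not say how a zero denominator is handled.

```python
from itertools import product

def agreement(k1, k2, variables):
    """Count the models of K1 ≡ K2 by enumerating all assignments."""
    count = 0
    for values in product([False, True], repeat=len(variables)):
        v = dict(zip(variables, values))
        if bool(k1(v)) == bool(k2(v)):
            count += 1
    return count

def similarity_linear(k1, k2, variables):
    """|Mod(K1 ≡ K2)| - |Mod(K1 ≢ K2)|, i.e. 2*|Mod(K1 ≡ K2)| - 2**n."""
    return 2 * agreement(k1, k2, variables) - 2 ** len(variables)

def similarity_quotient(k1, k2, variables):
    """|Mod(K1 ≡ K2)| / |Mod(K1 ≢ K2)| (assumed infinite if they never differ)."""
    agree = agreement(k1, k2, variables)
    differ = 2 ** len(variables) - agree
    return float("inf") if differ == 0 else agree / differ

variables = ["x1", "x2"]
k1 = lambda v: v["x1"]               # K1 = x1
k2 = lambda v: v["x1"] and v["x2"]   # K2 = x1 ∧ x2
linear = similarity_linear(k1, k2, variables)
quotient = similarity_quotient(k1, k2, variables)
```

Here K_1 and K_2 disagree only on the assignment that makes x1 true and x2 false,
so three of the four assignments are models of K_1 ≡ K_2.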

Having defined the measure of similarity δ of two knowledge bases, we can now
combine it with the number of transformations to define the ranking. Let us
therefore consider a specific n-tuple (L_1, ..., L_n). The knowledge bases generated
by the transformations are L_i(K_i). We compare them using δ and the number of
transformations in each set L_i. We use a simple linear combination of these two
measures.

\[ \mathcal{R}((\mathcal{L}_1, \ldots, \mathcal{L}_n)) = \sum_{K_i, K_j} \log(\delta(\mathcal{L}_i(K_i), \mathcal{L}_j(K_j)) + 1) + \sum_i |\mathcal{L}_i| \]

The logarithm is used to put the measure of similarity and the number of
transformations on the same scale: without it, the measure of similarity can be
exponentially large, thus making the contribution of the number of transformations
irrelevant. We used a logarithm (instead of a multiplying factor) because a difference
of distances should be less important when the number of different models is high:
the difference between one model and two is more important than the difference
between 1000 and 1001.

This ranking R defines a measure of goodness of transformations, and therefore
completes the merging process outlined in the previous section: given the knowledge
bases K_i, we can now tell exactly what the result of merging is.
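
The value of the ranking can be computed directly on small examples. The sketch
below is illustrative only and rests on assumptions not stated in the text: the
knowledge bases are given after the inverse transformations have been applied, the
sum ranges over unordered pairs, and δ defaults to |Mod(K_i ≡ K_j)|, which keeps
the argument of the logarithm positive. The names rank and agreement are
hypothetical.

```python
import math
from itertools import combinations, product

def agreement(k1, k2, variables):
    """Count the models of K1 ≡ K2 by enumerating all assignments."""
    count = 0
    for values in product([False, True], repeat=len(variables)):
        v = dict(zip(variables, values))
        if bool(k1(v)) == bool(k2(v)):
            count += 1
    return count

def rank(transformed, num_changes, variables, delta=agreement):
    """Sum of log(delta + 1) over pairs, plus the total number of changes.

    transformed: the knowledge bases after the inverse transformations.
    num_changes: the number of transformations applied to each of them.
    """
    similarity_term = sum(
        math.log(delta(a, b, variables) + 1)
        for a, b in combinations(transformed, 2)  # unordered pairs (assumption)
    )
    return similarity_term + sum(num_changes)

variables = ["x1", "x2"]
k = lambda v: v["x1"]
# Two identical knowledge bases, one of them obtained with a single change.
value = rank([k, k], [0, 1], variables)
```

With two identical knowledge bases over two variables, δ is 4, so the value is
log(5) + 1; the logarithm keeps the first term comparable to the count of changes.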

A problem of this ranking, however, is that it is based on assuming that all
knowledge bases derive from similar knowledge bases by applying some equally
likely transformations. While the equal likeliness is the natural result of assuming
no information about the likeliness of transformations, the assumption that the
S_i's are similar is questionable. In particular, it may be more reasonable to assume
that each knowledge base is "targeted" to a different subject. Indeed, it is likely that
each source uses the knowledge base for a specific purpose; as a result, the knowledge
bases may contain only information about some specific subjects.

To take this consideration into account, we do not measure how similar the knowl- 
edge bases are, but how similar they are when restricted to a subset of variables. 
Namely, let be the restriction of the formula K to the variables in Y . The 
difference between two formulae Ki and K2 is: 

5{Ki,K2)= ^ _ lyi ^1 

This is how we formalize the assumption that the result of the inverse
transformations may be a formula that is similar to the other one only for a subset
of its variables. The quotient is defined so that a difference in the case |Y| = 1
does not count the same as a difference in the case Y = X.

5 The Renaming Merging Operator 

The ranking defined in the previous section allows for determining the result of 
merging from the knowledge bases alone, without any additional information. In 
the belief revision terminology, this is a merging operator, as opposed to merging 
schemas, which require some additional information such as ranking, preferences, 
etc. (they are called schemas because they are the backbones of a merging process, 
but something has to be added to make them complete merging operators.) 

The operator defined in the last section allows for checking the validity of
properties that should hold for the merging process. However, the number of possible
transformations makes the operator quite complicated. We therefore make the
simplifying assumption that the only mistakes are those involving renamings. The set
of possible transformations is therefore defined as follows.

Definition 1 Given a set of variables X, a permitted inverse transformation is a
substitution X/Y in which each x_i is either substituted with another variable in X,
or renamed as the new variable x_i'.

This definition forbids the proliferation of new variables: if we have to replace
x_i with a new variable, we are forced to name it x_i'. This rule limits the number
of choices while renaming variables. Intuitively, if we have to change the name of a
variable, this rule lets us ignore the choice of the name of the new variable.
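
Definition 1 can be made concrete by enumerating the permitted inverse
transformations of a small alphabet. The sketch below is illustrative and makes two
assumptions that the text does not state: leaving a variable unchanged is one of the
available choices, and the size of a substitution is the number of variables it
actually renames. The function names are hypothetical.

```python
from itertools import product

def permitted_inverse_transformations(variables):
    """Yield each permitted substitution as a dict old_name -> new_name.

    Each x is mapped to a variable of the alphabet (possibly itself, which
    means "no change") or to its own fresh primed copy x'.
    """
    options = [list(variables) + [x + "'"] for x in variables]
    for targets in product(*options):
        yield dict(zip(variables, targets))

def size(substitution):
    """Number of variables that are actually renamed (assumed measure)."""
    return sum(1 for old, new in substitution.items() if old != new)

subs = list(permitted_inverse_transformations(["x1", "x2"]))
```

For n variables this yields (n + 1)^n substitutions, since each variable has n
choices inside X plus its single fresh name; without the restriction on fresh names
the number of choices would be unbounded.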

The merging operator is based on a particularization of the general model of
the sources; namely, the only considered transformations are renamings. Therefore,
in order to define a specific merging operator, we only need an ordering over the
renamings. Assuming all transformations equally likely, we get a merging operator
that we call the Renaming Merging with Equal Likeliness Operator, or RMEL for
short. For the sake of clarity, we only consider two knowledge bases, as is common
in the merging/arbitration literature.

Definition 2 The Renaming Merging with Equal Likeliness Operator *_RMEL
associates any two knowledge bases K_1 and K_2 to another knowledge base
K_1 *_RMEL K_2 defined as follows:

\[ K_1 *_{RMEL} K_2 = \bigvee_{(Y,Z) \in PIT} K_1[X/Y] \wedge K_2[X/Z] \]

where (Y, Z) ∈ PIT if and only if X/Y and X/Z are permitted inverse transformations
that satisfy K_1[X/Y] ∧ K_2[X/Z] ⊨ A and K_1[X/Y] ∧ K_2[X/Z] ∧ B ⊭ ⊥ and
are of minimal combined size (that is, the size of Y plus that of Z is minimal). X
is a subset of the variables in K_1, K_2, A, and B.

In this definition, we consider renamings of the variables in K_1 and in K_2 that
satisfy the bounds A and B. Using only permitted inverse transformations reduces
the number of disjuncts in the definition. Indeed, for each variable in each of the two
knowledge bases, we can either substitute it with another variable in X, or with a new
variable not appearing anywhere else. The use of new variables is necessary as the
two knowledge bases may use the same variables for different facts, so that either one
or both of them have to be renamed. Using only permitted inverse transformations
we avoid the problem of having to consider transformations that differ only in the
name of the new variables, since the name of new variables is defined uniquely. On
the other hand, permitted transformations are liberal enough to allow for making
the alphabets of K_1[X/Y] and K_2[X/Z] disjoint (just substitute each x_i with x_i' in
K_1, and make no changes to K_2). This may be necessary when the two knowledge
bases use exactly the same variables to represent completely different facts.

The rule of minimality excludes transformations that introduce renamings that
are not justified. This particular ordering is the one that reduces the number of
renamings the most, but other rules can be used instead: minimality w.r.t. set
containment, minimal size of Y and Z considered separately, user supplied ranking of
transformations, etc.

Let us consider some properties of this operator. We assume that both A and
B are consistent, and do not contradict each other. This is important: for example,
if A = a and B = ¬a, there is no way to make A implied and B consistent at
the same time. We therefore assume that A, B, and A ∧ B are all consistent. A
good property this merging operator should have is that of success, that is, it should
produce a meaningful result. In our case, since merging is defined as a disjunction
of possible hypotheses, this amounts to checking whether the resulting knowledge
base is consistent. Unfortunately, this may not be the case, as the following example
shows:

\[
\begin{aligned}
K_1 &= \neg x_1 \\
K_2 &= \neg x_2 \\
A &= x_1 \\
B &= \top
\end{aligned}
\]

The problem here is that the two knowledge bases both tell that something is false,
while we wanted a variable to be true after the merging, as A = x_1. The problem
could be overcome by considering transformations involving negative literals, but this
is quite unintuitive in this case: if we assume that the only problem is that we are
giving the wrong name to a fact, we cannot infer that a fact is true from a statement
saying that a fact is false.
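
Definition 2 and the example above can be checked by brute force on small inputs.
The sketch below is illustrative only: formulae are assumed to be Python functions
over assignments, the size of a substitution is assumed to be the number of renamed
variables, and the result is returned as a list of models over X plus the primed
variables rather than as a formula. The name rmel and its signature are hypothetical.

```python
from itertools import product

def assignments(names):
    """Yield every truth assignment over names as a dictionary."""
    for values in product([False, True], repeat=len(names)):
        yield dict(zip(names, values))

def rename(formula, sub):
    """Return K[X/Y]: each variable x of K is read from its new name sub[x]."""
    return lambda v: formula({x: v[new] for x, new in sub.items()})

def rmel(k1, k2, a, b, xs):
    """Models of the merge of k1 and k2 with upper bound a and lower bound b."""
    universe = list(xs) + [x + "'" for x in xs]
    options = [list(xs) + [x + "'"] for x in xs]
    subs = [dict(zip(xs, targets)) for targets in product(*options)]
    candidates = []
    for y, z in product(subs, repeat=2):
        k1y, k2z = rename(k1, y), rename(k2, z)
        mods = [v for v in assignments(universe) if k1y(v) and k2z(v)]
        implies_a = all(a(v) for v in mods)          # K1[X/Y] ∧ K2[X/Z] ⊨ A
        consistent_with_b = any(b(v) for v in mods)  # ... ∧ B is satisfiable
        if implies_a and consistent_with_b:
            cost = sum(y[x] != x for x in xs) + sum(z[x] != x for x in xs)
            candidates.append((cost, mods))
    if not candidates:
        return []  # empty disjunction: the result is inconsistent
    best = min(cost for cost, _ in candidates)
    result = []
    for cost, mods in candidates:
        if cost == best:  # keep only pairs of minimal combined size
            result.extend(m for m in mods if m not in result)
    return result

top = lambda v: True
# Example from the text: K1 = ¬x1, K2 = ¬x2, A = x1, B = ⊤.
failure = rmel(lambda v: not v["x1"], lambda v: not v["x2"],
               lambda v: v["x1"], top, ["x1", "x2"])
# A case where one renaming suffices: K1 = x1, K2 = ¬x1, A = B = ⊤.
success = rmel(lambda v: v["x1"], lambda v: not v["x1"], top, top, ["x1"])
```

In the first call no pair of renamings makes x1 derivable, so the disjunction is
empty and the result has no models. In the second, renaming x1 in either knowledge
base restores consistency, and the two minimal pairs contribute one model each.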

The reason why we cannot get success in this example is that the operator is
based on assuming that the knowledge bases are obtained by renamings of correct
ones, but the knowledge bases K_1 and K_2 of this example contradict this assumption.
Indeed, K_1 = ¬x_1 cannot be the result of changing a name in a knowledge base that
implies x_1. Obtaining a consistent result from the knowledge bases above would
therefore be counterintuitive, as the merging operator would be saying that the
assumption on the transformations (only name changes are possible) is consistent
with the available data, while in fact it is not.

This example shows that we cannot expect the merging operator to work correctly
when the assumptions it is based on do not hold. On the contrary, the properties
of this operator have to be checked with respect to two knowledge bases K_1 and K_2
that actually result from renaming some variables in two knowledge bases S_1 and S_2,
both consistent with B and both implying A.

If this is the case, the transformations can be inverted, and therefore the bounds
A and B can be satisfied. It does not matter that the inverse transformation is not
unique: to achieve derivability of A and consistency with B, all that is needed is that
there is at least one pair (Y, Z) such that K_1[X/Y] ∧ K_2[X/Z] ⊨ A and
K_1[X/Y] ∧ K_2[X/Z] ∧ B ⊭ ⊥. All other disjuncts involved in the definition (if any)
are consistent with B, and therefore their disjunction is consistent with B as well.
The upper bound A is satisfied for the same reason: since each element of the
disjunction implies A, all of its models are models of A.

A second property that we wish to obtain is that the original knowledge is
correctly, even if not completely, recovered. This is to say that, if S_1 ∧ S_2 ⊭ C,
then the result of merging K_1 with K_2 should not imply C either. However, this is
not always the case, as the following example shows.

\[
\begin{aligned}
K_1 &= x_1 \quad (S_1 = x_2) \\
K_2 &= \top \quad (S_2 = \top) \\
A &= \top \\
B &= \top
\end{aligned}
\]

This example clearly shows a problem that has been already mentioned in the
introduction: if we have no way to realize that a mistake has been made, then there is
no way to recover from it. In this case, assuming that both knowledge bases are free
of mistakes is not inconsistent with the bounds A and B. Therefore, K_1 ∧ K_2 = x_1 is
the result of merging simply because we have no reason to assume that a name change
is necessary. This conclusion is incorrect, as x_1 is not a consequence of
S_1 ∧ S_2 = x_2. This example also shows the obvious fact that we cannot enforce
completeness either: x_2 is a consequence of the original knowledge base, but is not
a consequence of the result of merging.

The fact that we cannot always recover the original knowledge bases, however,
is not unique to this operator. Even in the "mistake of value" assumption (that is,
in "traditional" belief revision operators), the way the result of merging is related
to the real world is conditioned to the validity of the minimal change principle. To
make a concrete example, if our real world is a ∧ b, and we have to revise K = ¬a
with P = b, we will always get the incorrect conclusion ¬a. This is simply because:

1. there is no evidence we need to make any change; 

2. we commit to the principle of making as few changes as possible. 

In our scenario, we do not have any evidence that makes us think that a mistake
has been made, and we therefore assume that the knowledge bases are correct.
Making any other choice without any additional justifying information would be
unmotivated.

We now consider the operator obtained by adding the ranking over transformations
defined in the previous section. The definition of ranking specializes to the case
of renamings only as follows.

Definition 3 The degree of a permitted inverse transformation X/Y, X/Z w.r.t. K_1,
K_2, A, and B is given by the following formula:

\[ \mathcal{R}((X/Y, X/Z)) = |X/Y| + |X/Z| + \log(\delta(K_1[X/Y], K_2[X/Z])) \]

In words, this ranking combines the number of name changes with the similarity
of the knowledge bases after the changes (the similarity measure δ can be defined as
shown in the previous section). In this case, we have used a simple linear combination,
but other combinations are possible (for example, we can first consider the number
of changes, and then the similarity only in case of ties).

Definition 4 The Renaming Merging Operator *^{A,B} associates with any two
knowledge bases K_1 and K_2 another knowledge base K_1 *^{A,B} K_2 defined as follows:

\[ K_1 *^{A,B} K_2 = \bigvee_{(Y,Z) \in MPIT} K_1[X/Y] \wedge K_2[X/Z] \]

where (Y, Z) ∈ MPIT if and only if X/Y and X/Z are minimal permitted inverse
transformations w.r.t. K_1, K_2, A, and B.

This operator differs from the previous one only in that the similarity between
the two knowledge bases is taken into account, and is weighted to the same degree
as the number of substitutions.

The same drawbacks of the operator with equal likeliness appear here. The
difference is that, using an ordering, we select fewer transformations. Thus, we have
fewer terms in the disjunction, and therefore the result of merging can be logically
stronger.

6 Complexity Results 

In this section, we consider the complexity of inference for the renaming merging 
with equal likeliness operator. Formally, given Ki, K2, A, B, and Q, we want to 
check whether Q is implied by the merge of Ki with K2, where A and B are the 
upper and lower bound, respectively. 

Theorem 1 The problem of checking whether K_1 *_RMEL K_2 ⊨ Q is Π^p_2-hard, and
is in Δ^p_3[log n].

Proof. Membership: finding the size of the minimal renamings that make K_1 and
K_2 consistent with B and not with ¬A can be done with a logarithmic number of
queries to an oracle that checks the existence of a substitution that satisfies both
constraints (this oracle must be in the second level of the polynomial hierarchy due
to the upper bound: it has to check the existence of a substitution such that the
resulting knowledge bases imply A).

Using the minimal size of substitutions, all that is needed is to check whether the
knowledge bases imply Q using all substitutions of minimal size that satisfy both
constraints.
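
The logarithmic number of queries in the membership argument comes from a binary
search on the combined size. The sketch below only illustrates that search: the
oracle is a hypothetical callable that, given a bound k, answers whether some pair
of substitutions of combined size at most k satisfies both constraints. It is
assumed to be monotone in k, and a toy oracle stands in for the real one.

```python
def minimal_size(oracle, max_size):
    """Smallest k in [0, max_size] with oracle(k) true, or None if there is none.

    Assumes oracle is monotone: once true for some k, it is true for all larger k.
    """
    if not oracle(max_size):
        return None
    lo, hi = 0, max_size
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle(mid):
            hi = mid          # a solution of size <= mid exists
        else:
            lo = mid + 1      # every solution needs more than mid changes
    return lo

queries = []

def toy_oracle(k):
    """Stand-in oracle: pretends that three name changes are needed."""
    queries.append(k)
    return k >= 3

result = minimal_size(toy_oracle, 10)
```

Since the combined size is bounded by twice the number of variables, the number of
oracle calls is logarithmic in the size of the input.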

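Hardness is proved by reduction from ∀∃QBF. We prove that ∀X∃Y.F is valid if
and only if K_1 *_RMEL K_2 ⊨ Q, where Q = a and

\[
\begin{aligned}
K_1 &= a \wedge x_1 \wedge \cdots \wedge x_n \\
K_2 &= \neg a \wedge \neg x_1 \wedge \cdots \wedge \neg x_n \\
A &= a \vee (\neg F[Y/Y_1] \wedge \cdots \wedge \neg F[Y/Y_{n+1}]) \\
B &= \top
\end{aligned}
\]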

In order to satisfy the lower bound, the substitutions must make K_1 and K_2
consistent. This is only possible by changing the name of each variable in
{a, x_1, ..., x_n} either in K_1 or in K_2. This way, putting together K_1 and K_2, we
obtain a formula that contains exactly one literal between x_i and ¬x_i and one
literal between a and ¬a, that is, a formula having exactly one model over the
variables X ∪ {a}.

Changing the names this way is necessary to satisfy the lower bound B. We can
also prove that n + 1 name changes are sufficient to satisfy the upper bound A. The
substitutions that rename a in K_2 are such that a is implied by K_1 and K_2 after the
renaming; therefore, A is implied as well. As a result, exactly n + 1 variable name
changes are needed to make both constraints satisfied. In particular, each variable
in X ∪ {a} has to be renamed in either K_1 or K_2: if the knowledge base that results
satisfies A, then this substitution is considered.

Let us first consider the case in which all variables in K_1 and K_2 are replaced
with new ones. In order to make K_1 and K_2 consistent, we have to rename any
variable in {a, x_1, ..., x_n} either in K_1 or in K_2. After the change, K_1 ∧ K_2 is
a knowledge base with exactly one model. The substitution is considered only if A
is implied by this model. By construction, A is implied only if either a is true, or
the values of the variables {x_1, ..., x_n} satisfy ¬F for all possible values of Y. As a
result, a substitution that makes a false satisfies the upper bound if and only if the
corresponding evaluation of the variables X falsifies F for any possible assignment
of the variables Y. As a result, Q = a is implied if and only if such assignments do
not exist, that is, for all values of X, there is a value of Y that satisfies F.

The reduction is so far proved only if we restrict to substitutions changing the
name of a variable to a new name. Let us now consider the other substitutions. If a
variable x_i is renamed to x_j, all that is said above still holds (as the variable x_i,
in one way or another, "disappears" from the knowledge base, and is therefore set to
the value it has in the other one). The only substitutions that cause problems are
those changing x_i (or a) into a variable in Y. Let us consider, for example, the case
in which the substitution x_1/y_1 is applied to K_1. Then, x_1 is set to false in the
resulting merging. At the same time, however, y_1 is set to true as well. This is a
problem if ¬F is not satisfied by the values of X alone, but it is if y_1 is true: if a
is set to false, we obtain that Q = a is not implied any more, while we know that the
partial evaluation of X does not satisfy F. This is why A contains n + 1 copies of F:
however we change the names of variables in X to variables in Y, the upper bound A
always contains a copy of F whose variables in Y_i are not mentioned in K_1 and K_2
after renaming. This ensures that A can only be derived if the partial evaluation of X
falsifies F. □

7 Conclusions 

The contribution of this paper is in the approach taken, rather than the proposed 
specific belief revision method. Starting from a very general model of the integration 
domain, we have shown that the existing semantics for knowledge integration corre- 
spond to a specific assumption. In this model, the sources get the knowledge bases 
they have by a process of acquisition that is prone to errors; previous integration 
semantics correspond to the assumption that mistakes are of a specific kind (which 
we called "mistakes of value"). Other mistakes are considered in this paper, leading 
to completely new integration semantics. The work reported here is still preliminary, 
as the properties of merging in the new models have not yet been fully investigated 
(comments and suggestions are welcome.) 

New issues come from further generalizing this model. For example, we have only
considered the case in which all knowledge bases are propositional. For first order
logic, new interesting cases arise: a form of generalization is to transform P(a) into
∀x.P(x); the opposite of particularization is also interesting; subject
misunderstanding is in this context different (it is the change of a constant, not the
change of a literal), etc.

Other issues arise from comparing the approach taken here with "classical" belief 
revision. In the usual formalization of belief revision, the knowledge expressed by 
each source is actually a set of preferences, rather than simply a knowledge base. 
This is because each agent involved in the merging process not only has some beliefs, 
but also acknowledges the possibility that they may be indeed false. As a result, it 
also has a measure of preference (degree of belief) over all facts it considers to be 
false, generating an ordering over the possible worlds. 

When merging is modeled under the assumption of mistakes, such an ordering cannot
be used. As is clear from the heuristics presented, it is impossible to express merging
as a merging of rankings, as the most likely transformations of each source depend
on the other knowledge bases. It is also true that we could consider a more
sophisticated model accounting for both rankings (expressing the measure of likeliness
of worlds according to each agent) and mistakes (that each agent made while getting
its knowledge).

Finally, let us briefly discuss the computational issues. The result of Section 6
shows that the proposed semantics is computationally harder than the propositional
calculus, as expected. Nevertheless, it is not much harder than most of the revision
operators, which are Π^p_2-complete [4]. A simplified definition has been used, but it
seems unlikely that this reduced complexity much.